Abstract

Thermal unfolding of proteins is compared to folding and mechanical stretch- 
ing in a simple topology-based dynamical model. We define the unfolding 
time and demonstrate its low-temperature divergence. Below a characteristic 
temperature, contacts break at separate time scales and unfolding proceeds 
approximately in a way reverse to folding. Features in these scenarios agree 
with experiments and atomic simulations on titin. 

Conformational changes in proteins occur in a variety of processes such as folding,
mechanically induced stretching, chemical denaturation, and thermally induced unfolding.
Standard molecular dynamics simulations of these processes cover nanosecond intervals,
which usually miss the relevant time scales by at least six orders of magnitude. One
may gain insights into the long-time-scale conformational dynamics by considering effective
coarse-grained models. Among these, the simplest and yet often successful are the
topology-based models. They are constructed based on the knowledge of the experimentally
established native conformations [1,2]. The topology-based models offer a possibility to
study various processes within one unified approach, an opportunity to explore relationships
between them, and an easy way to determine dependence on parameters, such as the
temperature.

In this Letter, we focus on thermal unfolding. This phenomenon is often invoked in the- 
oretical searches for a molecular interpretation of the transition state for the folding process 
[3]. The transition state is probed experimentally through the protein-engineering-based
so-called φ-value analysis [4]. The assumptions underlying the theoretical search for the
transition state by simulating thermal unfolding are that unfolding should proceed in a way 
that is reverse to folding and that the transition state should be quickly accessible from 
the native state, especially if high temperatures are applied (even up to 200°C which in 
itself may, however, alter the free energy landscape of a protein significantly). Additional 
assumptions attempt to relate the transition state to "large structural changes" [5,6] in an 
unfolding evolution of a protein - a point recently assessed in Ref. [7]. Here, we analyse 
thermal unfolding within the topology-based model as implemented in Refs. [8] and [9]. 
We characterize unfolding at various temperatures by determining unfolding times and by 
providing scenarios of unfolding. We show that there is a characteristic temperature, $T_\Omega$,
associated with unfolding above which rupturing of bonds occurs simultaneously (on
average) at all sequential separations. We show that the unfolding times diverge on lowering
the temperature and that below $T_\Omega$ the unfolding process runs in reverse to folding as
monitored at an optimal folding temperature. Some of our predictions regarding scenarios of
the conformational changes are found to be consistent with experimental findings.

We consider several model proteins with a special emphasis on the I27 globular domain
of the muscle protein titin (the Protein Data Bank [10] code 1tit). Mechanical stretching of
this protein has been extensively studied in experiments involving atomic force microscopy
[11-13] and there is also some information about its folding [14,15]. Furthermore, we have
already studied it undergoing both processes through molecular dynamics simulations within
the topology-based model [9,16-18].

This model can be outlined as follows. The protein is represented by the C$^\alpha$ atoms that
are tethered by harmonic potentials with minima at 3.8 Å. The native contacts are described
by the Lennard-Jones potentials

$$V_{ij} = 4\epsilon \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right],$$

where the length parameters $\sigma_{ij}$
are chosen so that the potential minima correspond, pair-by-pair, to the native distances
between the C$^\alpha$ atoms $i$ and $j$. Which amino acids form native contacts is determined through
atomic overlaps as described by Tsai et al. [19]. The non-native contacts are described by
repulsive cores of $\sigma = 5$ Å. The energy parameter, $\epsilon$, is taken to be uniform and its effective
value appears to be of order 900 K, at least for titin. The optimal folding temperature, $T_{min}$,
for I27 has been found to correspond to the reduced temperature $\tilde{T} = k_B T/\epsilon$ of 0.275 [9]
($k_B$ is the Boltzmann constant and $T$ is temperature), which is close to the room temperature
value of $\tilde{T} = 0.3$. In our stretching simulations, both ends of the protein are attached to
harmonic springs of elastic constant $k = 0.12\ \epsilon/$Å$^2$, which is close to the values corresponding
to the elasticity of experimental cantilevers. The free end of one of the two springs is
constrained while the free end of the second spring is pulled at constant speed, $v_p$, along the
initial end-to-end position vector. We focus on $v_p$ of 0.005 Å/$\tau$, where $\tau = \sqrt{m\sigma^2/\epsilon} \approx 3$ ps
is the characteristic time for the Lennard-Jones potentials. Here, $\sigma = 5$ Å is a typical value
of $\sigma_{ij}$ and $m$ is the average mass of the amino acids. Thermostatting is provided by the
Langevin noise, which also mimics random kicks by the implicit solvent. An equation of
motion for each C$^\alpha$ reads $m\ddot{\mathbf{r}} = -\gamma\dot{\mathbf{r}} + \mathbf{F}_c + \mathbf{\Gamma}$, where $\mathbf{F}_c$ is the net force on an atom due
to the molecular potentials. The damping constant $\gamma$ is taken to be equal to $2m/\tau$ and
the dispersion of the random forces is equal to $\sqrt{2\gamma k_B T}$. This choice of $\gamma$ corresponds to
a situation in which the inertial effects are negligible [8] but the damping action is not yet
as strong as in water. Increasing $\gamma$ twentyfold results in a twentyfold increase in the time
scales, which brings the typical value of $v_p$ within two orders of magnitude of the experimental
pulling speeds [9] and gives correspondingly longer folding times [8]. The equations of motion are
solved by a fifth-order predictor-corrector scheme.
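
The choice of the length parameters and the meaning of the reduced temperature can be illustrated with a minimal sketch. It is not the authors' code: it assumes the standard 12-6 Lennard-Jones form, whose minimum lies at $2^{1/6}\sigma_{ij}$, so that matching the minimum to a native distance $d$ requires $\sigma_{ij} = 2^{-1/6} d$. The function names and the native distance of 6.0 Å are hypothetical, and the energy scale of 900 K is the order-of-magnitude estimate quoted in the text.

```python
EPSILON_K = 900.0  # effective energy scale in kelvin ("of order 900 K" in the text)


def sigma_from_native_distance(d_native):
    """Length parameter that places the 12-6 Lennard-Jones minimum at d_native."""
    return d_native / 2.0 ** (1.0 / 6.0)


def lj_energy(r, sigma, epsilon=1.0):
    """12-6 Lennard-Jones energy; in units of epsilon by default."""
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)


def reduced_to_kelvin(t_reduced, epsilon_k=EPSILON_K):
    """Convert a reduced temperature k_B T / epsilon to kelvin."""
    return t_reduced * epsilon_k


d_native = 6.0  # hypothetical native C-alpha distance, in angstrom
sigma = sigma_from_native_distance(d_native)
print(lj_energy(d_native, sigma))   # well depth, approximately -1 in units of epsilon
print(reduced_to_kelvin(0.3))       # approximately 270 K for the "room temperature" value
print(reduced_to_kelvin(0.275))     # optimal folding temperature of I27 in kelvin
```

With these assumptions the potential is minimal exactly at the native distance, so the native structure is the ground state of the contact energy by construction.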

The top two panels of Figure 1 illustrate what happens to distances between two pairs
of amino acids that make native contacts when the I27 domain of titin is subjected to a
very high reduced temperature of 1.1. The broken line corresponds to the distance of $1.5\sigma_{ij}$
that is considered as a qualitative threshold for the amino acids staying or not staying in
contact [8,16]. It is seen that the threshold line is crossed repeatedly due to thermal
fluctuations. However, there is a well-defined and pair-specific time $t_u$ at which the contact
breaks for good, at least within the unfolding time of the trajectory, which is defined by the
requirement that all non-local contacts are broken. Specifically, non-locality refers to the
sequential distance $|j - i| > l$, where $l = 4$ (non-helical). Example values of $t_u$ are indicated in
the top panels of Figure 1. The unfolding scenarios may be defined in terms of a list of the
times $t_u$ that are averaged over several hundred different trajectories.
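
The definitions above can be sketched in code as follows. This is one possible reading, not the authors' implementation: the unfolding time of a trajectory is taken as the first saved frame at which all non-local contacts are simultaneously beyond the threshold, and the breaking time of a contact is the frame right after the contact was last closed before that moment. The function name, the array layout, and the toy data are hypothetical.

```python
import numpy as np


def unfolding_analysis(times, dist, sigma, pairs, l_min=4, factor=1.5):
    """Return (t_unfold, t_u) for one trajectory, or (None, None) if no unfolding.

    times : (n_frames,) time stamps of the saved frames
    dist  : (n_frames, n_contacts) distances of the native contacts
    sigma : (n_contacts,) Lennard-Jones length parameters of the contacts
    pairs : (n_contacts, 2) residue indices (i, j) of each contact
    """
    broken = dist > factor * sigma                       # True where a contact is open
    nonlocal_mask = np.abs(pairs[:, 1] - pairs[:, 0]) > l_min
    all_open = broken[:, nonlocal_mask].all(axis=1)
    if not all_open.any():
        return None, None                                # no unfolding within the run
    k_unf = int(np.argmax(all_open))                     # first frame with all non-local contacts open
    t_u = np.empty(dist.shape[1])
    for c in range(dist.shape[1]):
        closed = np.flatnonzero(~broken[: k_unf + 1, c])
        # The contact breaks for good one frame after it was last closed.
        k = closed[-1] + 1 if closed.size else 0
        t_u[c] = times[min(k, k_unf)]
    return times[k_unf], t_u


# Toy data: two non-local contacts, threshold 1.5 * 4.0 = 6.0
times = np.arange(6.0)
dist = np.array([[5, 5], [7, 5], [5, 5], [7, 5], [7, 7], [7, 7]], dtype=float)
sigma = np.array([4.0, 4.0])
pairs = np.array([[0, 10], [2, 20]])
t_unfold, t_u = unfolding_analysis(times, dist, sigma, pairs)
print(t_unfold, t_u)  # 4.0 [3. 4.]
```

In the toy trajectory the first contact reopens once before breaking for good, which is why its breaking time is 3 and not 1.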

The unfolding times vary across the trajectories and we define $t_\Omega$ as their median (a
simple average would be ill-defined if there was no unfolding within a cutoff duration in the
time evolution). The lower left panel of Figure 1 shows the temperature dependence of $t_\Omega$
for three model proteins including the I27 domain of titin. On lowering $T$, $t_\Omega$ grows
rapidly, faster than according to the Arrhenius law, suggesting perhaps a Vogel-Fulcher-like
divergence at a finite $T$. However, a finite system such as a single protein can give rise
to a divergence only at $T = 0$. This statement also applies to folding times, which generally
have a U-shaped temperature dependence with divergences at zero and infinity, the
former being Arrhenius-like [20,8] (mean field theories may lead to different conclusions).
The $T$-dependence of $t_\Omega$ appears to be consistent, to leading order, with the $\exp(A/\tilde{T}^2)$
law. Interpretation of this law, and corrections to it, remain to be elucidated. Our data do
not rule out a power law divergence either.
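
A short sketch of the two ingredients of this paragraph follows. The first function evaluates the fitted law with the $(A, B, C)$ values quoted for titin in the caption of Fig. 1; the second shows why a median remains usable when some runs do not unfold. The function names and the convention of marking a run that did not unfold with `None` are assumptions made for illustration.

```python
import math
import statistics

# (A, B, C) for titin as quoted in the caption of Fig. 1
A, B, C = 10.592, 10.492, 6.976


def fitted_unfolding_time(t_reduced, a=A, b=B, c=C):
    """Median unfolding time in units of tau from exp(A/T^2 - B/T + C)."""
    return math.exp(a / t_reduced**2 - b / t_reduced + c)


def median_unfolding_time(times, cutoff):
    """Median over trajectories; a run that did not unfold is passed as None.

    Returns None when the median itself lies beyond the cutoff, i.e. when
    too few runs unfolded for the median to be determined.
    """
    censored = [math.inf if t is None else t for t in times]
    med = statistics.median(censored)
    return med if med <= cutoff else None


for t_reduced in (1.1, 0.85, 0.7):
    print(t_reduced, fitted_unfolding_time(t_reduced))

print(median_unfolding_time([100.0, 300.0, None], cutoff=1000.0))  # 300.0
```

A single run that fails to unfold makes the mean infinite, whereas the median stays finite as long as more than half of the runs unfold within the cutoff.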

We have found that varying the parameter $l$ between 4 and 10 affects $t_\Omega$ insignificantly.
However, reducing $l$ below 4 results in a substantially different $t_\Omega$, as shown in the lower right
panel of Figure 1 for the 1bba protein for $l = 2$. This suggests a physical relevance of $l = 4$
for distinguishing between local and non-local contacts [21]. Considering $l$ smaller than 4 is
impractical computationally for proteins that are bigger than 1bba, and appears to have no
justification in the chemical denaturation [22]. The lower right panel of Figure 1 shows that
the divergence of the unfolding time also applies to secondary structures. In that theoretical
case, $l = 2$ is a more sensible choice to take.

There are characteristic temperatures that are associated with the processes of folding
and stretching. In the case of folding, it is the temperature of kinetic optimality (0.275
for 1tit). In the case of stretching, it is the temperature (0.8 for 1tit) at which the purely
entropic behavior [23] sets in: above it, force peaks disappear and the system acquires the
worm-like-chain behavior [24]. Is there a characteristic temperature, $T_\Omega$, that can be
associated with thermal unfolding? Figure 2 suggests that such a temperature indeed exists
and its reduced value for titin is around 1.1. Below this temperature, the median and the
most probable unfolding times diverge from each other significantly, indicating emergence
of a broad distribution of time scales and temporal separation of the unfolding events.
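
The comparison between the median and the most probable unfolding time can be sketched as below. This is an illustrative estimate only: the most probable time is taken as the centre of the tallest histogram bin, the bin count is an arbitrary choice, and the sample values are made up for the example rather than taken from the simulations.

```python
import numpy as np


def median_minus_peak(times, bins=30):
    """Difference between the median and the centre of the tallest histogram bin."""
    counts, edges = np.histogram(times, bins=bins)
    k = int(np.argmax(counts))
    peak = 0.5 * (edges[k] + edges[k + 1])
    return float(np.median(times) - peak)


# Made-up, right-skewed sample of unfolding times (arbitrary units)
sample = np.array([1.0, 2.4, 2.5, 2.6, 3.5, 4.5, 6.5, 9.5, 10.0])
print(median_minus_peak(sample, bins=9))  # 1.0
```

For a narrow, nearly symmetric distribution this difference is close to zero, while a long tail of slow events pushes the median above the peak.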

The left top panel of Figure 3 shows the average scenarios of the unfolding events in
titin at two temperatures: at $T_\Omega$ and substantially below it, i.e., at $\tilde{T} = 0.85$. These scenarios
show the average unfolding times, $\langle t_u \rangle$, of specific contacts. These times are plotted
against the contact order, i.e., against the sequential distance $|j - i|$. In order to see the
details in the scenarios, we actually plot $\langle t_u \rangle - t_\Omega$, which removes the dominant time scale.
We observe that at and above $T_\Omega$, the thermal fluctuations destroy bonds nearly
simultaneously, independent of the contact order. On the other hand, below $T_\Omega$, the scenarios acquire
reproducible structures in which the contacts between strands C and F (solid squares) and
between strands A and G (solid circles) disintegrate much sooner, on average, than those
between strands B-E (open triangles) and B-G (open circles). This order of events agrees
with all-atom nanosecond-long unfolding simulations of titin by Paci and Karplus [25] in
which several trajectories were studied at 450 K.

In broad features, the unfolding scenario at $\tilde{T} = 0.85$ runs in reverse to the folding
scenario at $T_{min}$ shown in the top right panel of Figure 3. However, the cross correlation plot
between the two scenarios, shown in the bottom left panel of Figure 3, indicates that the
two processes are not simply anticorrelated but merely reflect the time flow of events in
a fairly monotonic fashion. It is interesting to note that the events which are most relevant
to the search of the transition state - the final stages of folding and the initial stages of
unfolding - anticorrelate in a nearly linear way. This point qualitatively agrees with
all-atom simulations for the β-hairpin fragment of protein G [3] that were performed between
certain characteristic sets of conformations (16 amino acids were considered, the unfolding
simulations took place at 350 K).

The folding scenario shown in the top right panel of Figure 3 is defined in terms of
average times, $t_c$, at which specific contacts are established for the first time [8]. (Folding is
considered to be fully accomplished when all contacts are simultaneously established for the
first time.) We have found that folding in the model titin takes place in two channels. In
the first channel, comprising about 24% of the trajectories, the C-F contacts are established
in twice as long a time as needed to set the A-G and A'-G contacts. In the second channel,
comprising the remaining 76% of the trajectories (at the temperature of optimal folding),
C-F gets established somewhat earlier than the A-G contacts. The scenario shown in
Figure 11 of Reference [9] combines the two channels. The scenario shown here discards the
minority channel. Our studies of a generalized model of titin, in which the C$^\beta$ atoms are
also included in the description of the model, agree qualitatively with the majority-channel
scenario. This updated scenario is consistent with the φ-value data [14,15].

Finally, we consider mechanical stretching of 1tit at constant speed. Its scenario is
defined in terms of the last average distance, $d_u$, at which contacts are still holding when
the C-terminus is moving at a constant speed and the N-terminus is attached to an elastic
anchor. We have already established that stretching at "room temperature" proceeds in a
way that is unrelated to folding [16,9]. It is only in the entropic limit, when stretching is
governed exclusively by the sequential distance, that stretching is approximately reverse to
folding at optimality [23]. The right bottom panel of Figure 3 shows that when unfolding
and stretching are both done at $\tilde{T} = 0.85$ they follow each other in a monotonic way. The
order of events is nearly identical but the time intervals between them do not scale linearly
except perhaps at the very beginning of the two processes. The reduced temperature of 0.85
belongs to the entropic regime but, at the same time, it is below $T_\Omega$. The inset of this
panel shows, however, that stretching at the "room temperature" value of $\tilde{T} = 0.3$ does
not correlate with unfolding at $\tilde{T} = 0.85$ at all.

In summary, we have provided an operational definition of the unfolding times,
demonstrated their "low"-temperature (faster than Arrhenius) divergence and indicated the existence
of a characteristic temperature below which unfolding scenarios have contact-order-related
structure and time scales become broadly distributed. We have demonstrated that long-time
folding events are anticorrelated with the short-time unfolding events. We find that the
simple topology-based dynamical models qualitatively capture what is known from experiments
and simulations about the average order in which conformational changes proceed in titin.

FIGURE CAPTIONS



Fig. 1. The top panels show examples of evolution of the distance between two
contact-making amino acids $i$ and $j$ in the I27 (1tit) domain of titin when starting from
the native structure and then applying a temperature of $\tilde{T} = 1.1$. The contacts involved
are between the β-strands C, F and B, G. The A, A', B, C, D, E, F, and G strands
in titin correspond to the sequential segments 4-7, 11-15, 18-25, 32-36, 47-52, 55-61,
69-75, and 78-88 respectively. This protein consists of altogether 89 amino acids. The
lower panels show $t_\Omega$ for the systems indicated. The lower left (right) panel refers to
calculations done with the $l = 4$ ($l = 2$) criterion. The data are generally based on
at least 201 trajectories; above $\tilde{T}$ of 0.8, on at least 501 trajectories. There are two
data sets for titin. The solid symbols correspond to the Go-like model discussed in
this paper whereas the open symbols correspond to a generalized Go-like model with
side groups in which the degrees of freedom related to the C$^\beta$ atoms are included.
The generalized model shows a similar behavior. The lines in the lower panels
illustrate fits to the $t_\Omega/\tau = \exp(A/\tilde{T}^2 - B/\tilde{T} + C)$ law, where the sets $(A, B, C)$ are
(10.592, 10.492, 6.976) for 1tit, (8.381, 9.887, 6.258) for 1crn, (1.391, 0.943, 1.548) for 1bba
with $l = 4$, (5.259, 1.576, 4.916) for 1bba with $l = 2$, (2.329, 0.671, 0.659) for the hairpin,
and (3.363, 2.143, 2.585) for the helix. The fitting confidence level is at least 0.987.
Somewhat poorer fits were obtained by using the $\exp(\exp(D/\tilde{T}))$ law. The data points
for the C$^\alpha$-based model can also be superficially fitted to the Vogel-Fulcher-like
divergences with the apparent $\tilde{T}$ of 0.56, 0.44, and 0.20 and with the energy barriers of 2.6,
2.3, and 1.5 for 1tit, 1crn, and 1bba respectively.

Fig. 2. The distribution of unfolding times for 1tit for the three temperatures indicated.
The arrows point at the median values. The inset in the middle panel shows the 
temperature dependence of the difference between the median and peak values in the 
distributions. 

Fig. 3. The top left and right panels show the unfolding and folding scenarios in 1tit
respectively. The data are averaged over 501 trajectories. The unfolding scenarios are
shown for the two temperatures indicated. The broken line separates the data points
obtained for the two temperatures. The folding scenario corresponds to the
temperature of the fastest folding. The symbols assigned to specific contacts are the same
in both panels. Open circles, open triangles, open pentagons, solid circles, and solid
squares correspond to contacts B-G, B-E, D-E, A-G, and C-F respectively. The stars
denote all other contacts. The lower left panel cross-plots the folding scenario with
the unfolding scenario at the temperatures indicated. The lower right panel
cross-plots events of mechanical stretching with those of thermal unfolding. The mechanical
stretching in the inset and the main panel is performed at $\tilde{T} = 0.3$ and 0.85
respectively.